Similarities Between Policy Gradient Methods (PGM) in Reinforcement Learning (RL) and Supervised Learning (SL)

نویسندگان

چکیده

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Policy Gradient Methods for Reinforcement Learning with Function Approximation

Function approximation is essential to reinforcement learning, but the standard approach of approximating a value function and determining a policy from it has so far proven theoretically intractable. In this paper we explore an alternative approach in which the policy is explicitly represented by its own function approximator, independent of the value function, and is updated according to the ...

متن کامل

Reinforcement Using Supervised Learning for Policy Generalization

Applying reinforcement learning in large Markov Decision Process (MDP) is an important issue for solving very large problems. Since the exact resolution is often intractable, many approaches have been proposed to approximate the value function (for example, TD-Gammon (Tesauro 1995)) or to approximate directly the policy by gradient methods (Russell & Norvig 2002). Such approaches provide a poli...

متن کامل

RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning

Deep reinforcement learning (deep RL) has been successful in learning sophisticated behaviors automatically; however, the learning process requires a huge number of trials. In contrast, animals can learn new tasks in just a few trials, benefiting from their prior knowledge about the world. This paper seeks to bridge this gap. Rather than designing a “fast” reinforcement learning algorithm, we p...

متن کامل

Rl: Fast Reinforcement Learning via Slow Reinforcement Learning

Deep reinforcement learning (deep RL) has been successful in learning sophisticated behaviors automatically; however, the learning process requires a huge number of trials. In contrast, animals can learn new tasks in just a few trials, benefiting from their prior knowledge about the world. This paper seeks to bridge this gap. Rather than designing a “fast” reinforcement learning algorithm, we p...

متن کامل

Scalable Multitask Policy Gradient Reinforcement Learning

Policy search reinforcement learning (RL) allows agents to learn autonomously with limited feedback. However, such methods typically require extensive experience for successful behavior due to their tabula rasa nature. Multitask RL is an approach, which aims to reduce data requirements by allowing knowledge transfer between tasks. Although successful, current multitask learning methods suffer f...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: SSRN Electronic Journal

سال: 2019

ISSN: 1556-5068

DOI: 10.2139/ssrn.3391216